Management of XML Data by Means of Schema Matching
Abstract
The eXtensible Markup Language (XML) has emerged as a de facto standard for representing and exchanging information among applications on the Web and within organizations, owing to its inherent self-describing capability and its flexibility in organizing data. As a result, the amount of available (heterogeneous) XML data is increasing rapidly, and so is the need for high-performance techniques to manage these data. A first step in managing these data is to identify semantic correspondences across them. The process of identifying semantic correspondences among heterogeneous XML data is called XML schema matching. Schema matching plays a central role in several shared XML data applications, such as XML data integration, XML data migration, XML data clustering, and peer-to-peer systems. Consequently, a myriad of matching algorithms have been proposed and many matching systems have been developed. However, most of these systems score individual schema elements against one another, which results in discovering only simple (one-to-one) matches. Such results solve the schema matching problem only partially. To solve the problem completely, a matching system should discover complex matches as well as simple ones. Another dimension of schema matching that should be considered is scalability. Existing matching systems rely heavily on either rule-based or learning-based approaches. Rule-based systems represent the schemas to be matched in a common data model, such as schema trees or schema graphs, and then apply their algorithms to that model, which requires traversing the schema trees (or graphs) many times. By contrast, learning-based systems need considerable pre-match effort to train their learners. As a consequence, matching efficiency declines sharply, especially for large-scale schemas and in dynamic environments. More recent schema matching systems have been developed in an attempt to improve matching efficiency; however, they consider only simple matching. Discovering complex matches while scaling to both a large number of schemas and large-scale schemas therefore remains a real challenge. This thesis proposes a new matching approach, called sequence-based schema matching, to discover both simple and complex matches in the context of large-scale XML schemas. The approach exploits the Prüfer encoding method, which constructs a one-to-one correspondence between schema trees and sequences. As a result of sequence-
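The one-to-one correspondence between trees and sequences can be illustrated with the classical Prüfer code. The sketch below is only an assumption about the general idea: it uses the textbook construction on a tree whose nodes carry the integer labels 1..n, and the function names `prufer_encode` and `prufer_decode` are hypothetical. The encoding variant actually used in the thesis is not given in this abstract and may differ.

```python
import heapq
from collections import defaultdict


def prufer_encode(edges, n):
    """Return the classical Prüfer sequence of a labeled tree on nodes 1..n (n >= 2)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of current leaves, so the smallest-labeled leaf is removed first.
    leaves = [v for v in range(1, n + 1) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        neighbor = next(iter(adj[leaf]))  # a leaf has exactly one neighbor
        seq.append(neighbor)
        adj[neighbor].discard(leaf)
        if len(adj[neighbor]) == 1:       # the neighbor has become a leaf
            heapq.heappush(leaves, neighbor)
    return seq


def prufer_decode(seq):
    """Rebuild the edge list of the unique tree with the given Prüfer sequence."""
    n = len(seq) + 2
    degree = [1] * (n + 1)                # index 0 is unused
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


# Hypothetical five-node schema tree: node 1 is the root with children 2 and 3,
# and node 3 has children 4 and 5.
tree = [(1, 2), (1, 3), (3, 4), (3, 5)]
sequence = prufer_encode(tree, 5)
print(sequence)                 # [1, 3, 3]
print(prufer_decode(sequence))  # the same four edges, possibly in another order
```

Because decoding recovers exactly the tree that was encoded, operations on trees can in principle be restated as operations on sequences, which is the property the abstract refers to.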
Similar resources
Management of XML data by means of schema matching
XML Schema Definition is a recommendation from the World Wide Web Consortium that specifies the elements ... Consider niche tech XQuery to bring improvements to data integration. Georg Gottlob MASTER THESIS Schema Matching and Automatic Web Data can mean any model, for instance, an XML schema, interface definition, semantic management it became a ...
An Improved Semantic Schema Matching Approach
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2], and schema integration. The task is defined as finding the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new impr...
Matching of Ontologies with XML Schemas Using a Generic Metamodel
Schema matching is the task of automatically computing correspondences between schema elements. A multitude of schema matching approaches exists for various scenarios using syntactic, semantic, or instance information. The schema matching problem is aggravated by the fact that models to be matched are often represented in different modeling languages, e.g. OWL, XML Schema, or SQL DDL. Consequen...
PLASMA: A Platform for Schema Matching and Management
This paper introduces an XML Schema management platform that promotes the use of matching techniques to fulfill the requirements of data integration and data exchange. The existing platforms on the market support only graphical, not automatic, matching. Several matching algorithms have been suggested by different researchers to automate the discovery of correspondences between XML Schemas. Thes...
Semantic Web Technologies and Data Management
The Semantic Web aims to build a common framework that allows data to be shared and reused across applications, enterprises, and community boundaries. It proposes to use RDF as a flexible data model and use ontology to represent data semantics. Currently, relational models and XML tree models are widely used to represent structured and semi-structured data. But they offer limited means to captu...
Publication date: 1973